Abstract

In this paper, we develop a framework to design sensing matrices for compressive sensing applications 
that lead to good mean squared error (MSE) performance subject to sensing cost constraints. By capital- 
izing on the MSE of the oracle estimator, whose performance has been shown to act as a benchmark to 
the performance of standard sparse recovery algorithms, we use the fact that a Parseval tight frame is the 
closest design - in the Frobenius norm sense - to the solution of a convex relaxation of the optimization 
problem that relates to the minimization of the MSE of the oracle estimator with respect to the equivalent 
sensing matrix, subject to sensing energy constraints. Based on this result, we then propose two sensing 
matrix designs that exhibit two key properties: i) the designs are closed form rather than iterative; ii) 
the designs exhibit superior performance in relation to other designs in the literature, which is revealed 
by our numerical investigation in various scenarios with different sparse recovery algorithms including 
basis pursuit de-noise (BPDN), the Dantzig selector and orthogonal matching pursuit (OMP). 

I. Introduction 

The presence of redundancy in most signals in nature offers the means to transform the original 
signals into a compressed version convenient for storage and transportation. Compressive sensing (CS) is 
a new sampling paradigm that, rather than conforming to the traditional two-stage process involving signal
sampling followed by signal compression, directly acquires a compressed version of the original signal
by leveraging signal sparsity (a form of redundancy) as well as random sensing or measurement.
In fact, it has been shown that if an n-dimensional signal admits an s-sparse representation then one
can reconstruct exactly the original signal with $m = O(s \log(n/s))$ measurements [1], [2]. Also, if the
original signal admits only a nearly sparse representation (and/or the measurements are corrupted by some
noise) then one can still reconstruct the original signal subject to a tolerable distortion [1], [2]. Therefore,
CS offers the prospect of a more efficient signal acquisition in relation to traditional Shannon-Nyquist 
sampling, especially in applications where the sampling process is expensive such as magnetic resonance 
imaging [3] and data acquisition in wireless sensor networks [4].

A recent growing trend relates to the use of more complex signal models that go beyond the simple 
sparsity model to further enhance the performance of CS. For example, Baraniuk et al. [5] have introduced
model-based compressive sensing, where more realistic signal models such as wavelet trees or block 
sparsity are leveraged in order to reduce the number of measurements required for reconstruction. In 
particular, it has been shown that robust signal recovery is possible with $m = O(s)$ measurements in
model-based compressive sensing [5]. Ji et al. [6] introduced Bayesian compressive sensing, where a
signal-specific statistical model is exploited to reduce the number of measurements needed for reconstruction.
In [7], [8], reconstruction methods have been proposed for manifold-based CS, where the signal is assumed
to belong to a manifold. Other works that consider various sparsity models that go beyond simple sparsity
in order to improve the performance of traditional CS include [9]-[15].

The use of additional signal knowledge also enables one to replace the conventional random sensing 
matrices by optimized ones in order to further enhance CS performance (e.g., see [16]-[20]). A number of
conditions have been put forth to study the impact of the sensing matrices in various recovery algorithms.
The null space property represents a necessary and sufficient condition for sparse recovery [2]. However,
it is difficult to verify whether or not a certain sensing matrix fulfills this condition. Other more widely
used conditions include the restricted isometry property (RIP) [1], which is also difficult to evaluate,
and the mutual coherence [21], which is easier to evaluate. However, because these conditions mainly
address worst-case rather than expected-case performance, using them as the basis of sensing matrix
designs is too conservative. As such, Calderbank et al. [22] have put forth a
weaker version of the RIP, the statistical restricted isometry property (StRIP), where a probability criterion
replaces the hard requirement demanded by RIP. StRIP has also been used as the basis of various sensing
matrix designs presented in [22].

In this paper, we develop a general framework to design sensing matrices for compressive sensing
applications that lead to good (expected-case) mean squared error (MSE) performance subject to sensing
energy constraints, where the expectation is with respect to both the statistical distribution of the signal 
and the noise. We also leverage additional signal knowledge, by considering a general random signal 
model where the distinct support patterns of the same sparsity level occur with equal probability in the 
sparse representation of the original signal, and the autocorrelation matrix of the sparse representation 
is equal to an identity matrix. Our approach is based on the analysis of the oracle estimator MSE [23],
whose performance has been shown to act as a benchmark to the performance of various common sparse 
recovery algorithms. By showing that good equivalent sensing matrices (that correspond to the product of 
the sensing matrix and the sparsifying dictionary) ought to be close to a Parseval tight frame, we are then 
able to put forth two new sensing matrix designs that conform to specific sensing energy constraints. Our 
experiments reveal that the proposed designs improve expected-case signal reconstruction performance
in relation to random designs or other optimized designs [16]-[18]. Another notable advantage of our
proposed designs is that they are closed-form whereas the designs in [16]-[18] are iterative.

Our design approach, which is applicable to signals that are sparse in any dictionary, shares some 
of the elements of the design approach in [24], which is only applicable to signals that are sparse
in an orthonormal basis. In particular, this contribution - as does [24] - also capitalizes on the oracle
estimator MSE to put forth adequate sensing matrix designs. However, this design approach also departs
significantly from that in [24], in view of the fact that it is not clear how to generalize the methodology
in [24] from orthonormal to overcomplete dictionaries (namely, Propositions 1 and 2 in [24]).

Therefore, the current generalization is based on two questions that are answered in the article. We 
first ask: 

1) What is the equivalent sensing matrix that leads to the lowest oracle estimator MSE for a certain target 
signal to noise ratio (SNR) at the input of the oracle estimator? 

Further, in view of the fact that a Parseval tight frame is likely to provide a low oracle MSE subject to 
a target SNR at the input of the oracle, we then ask: 

2) What is the sensing matrix that offers the best compromise between "sensing cost" and "closeness" 
of the equivalent sensing matrix to a Parseval tight frame? 

It is this angle of attack - which departs from that in [24] - that enables us to generalize the sensing
matrix designs for signals that are sparse in arbitrary overcomplete dictionaries. Interestingly, the ensuing
designs are shown to reduce to the designs in [24] when the dictionary is orthonormal rather than
overcomplete. 

The generalization of the work from the orthonormal to overcomplete dictionary case is relevant
not only theoretically but also practically. For example, allowing signals to be sparse in overcomplete
dictionaries adds considerable flexibility and extends the range of applicability of CS [25]-[27]. Of particular
relevance, this generalization also leads to further insight about the behavior of random vs. optimized
projections, which is also exposed in this contribution.

The rest of this paper is organized as follows. We begin by describing the CS model and assumptions 
in Section II. Section III provides the rationale for the sensing matrix designs, by highlighting the role 
of Parseval tight frames in compressive sensing applications. Section IV puts forth our proposed sensing 
matrix designs, which capitalize on the intuition unveiled in Section III. Section V presents a range of 
numerical results that highlight the merits of our proposed designs in relation to other designs in the 
literature. Section VI discusses the MSE performance yielded by both random and optimized projection
designs. The main contributions of the article are finally summarized in Section VII.

Throughout this paper, signals are treated as real-valued vectors. Lower-case letters denote scalars,
boldface upper-case letters denote matrices, boldface lower-case letters denote column vectors, and
calligraphic upper-case letters denote support sets. $\mathbf{0}$ and $\mathbf{1}$ denote a vector with all zeros and all ones,
respectively, and $\mathbf{O}_{m \times n}$ denotes an $m \times n$ matrix with all zeros. The superscripts $(\cdot)^T$ and $(\cdot)^{-1}$ denote
matrix transpose and matrix inverse, respectively. The $\ell_0$ norm, the $\ell_1$ norm, and the $\ell_2$ norm of vectors
are denoted by $\|\cdot\|_0$, $\|\cdot\|_1$ and $\|\cdot\|_2$, respectively. The Frobenius norm and spectral norm of a matrix $\mathbf{A}$
are denoted by $\|\mathbf{A}\|_F$ and $\|\mathbf{A}\|$, respectively. The rank and trace of a matrix are denoted by $\mathrm{rank}(\cdot)$ and
$\mathrm{Tr}(\cdot)$, respectively. The diagonal matrix with diagonal elements given by either vector $\mathbf{a}$ or the diagonal
elements of matrix $\mathbf{A}$ is denoted by $\mathrm{Diag}(\mathbf{a})$ or $\mathrm{Diag}(\mathbf{A})$, respectively. The element corresponding to the
$i$th row and $j$th column of the matrix $\mathbf{A}$ is denoted by $a_{ij}$, and $\mathbf{a}_i$ denotes the $i$th column of the matrix
$\mathbf{A}$. $\mathbf{I}_n$ denotes the $n \times n$ identity matrix, and $\mathbf{J}_n$ denotes the $n \times n$ anti-diagonal matrix (an identity
matrix with a reversed order of the columns (or rows)). $\mathbf{E}_{\mathcal{J}}$ denotes the matrix that results from the
identity matrix by deleting the set of columns out of the support $\mathcal{J}$. $\mathbb{E}(\cdot)$ denotes the expectation; $\mathbb{E}_{\mathbf{x}}(\cdot)$
and $\mathbb{E}_{\mathcal{J}}(\cdot)$ denote expectation with respect to the distribution of the random vector $\mathbf{x}$ and the random
support $\mathcal{J}$, respectively. $\binom{n}{m}$ denotes the number of $m$-combinations from a given set of $n$ elements.
$\Pr(\cdot)$ denotes the probability. Finally, $\mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ denotes the multivariate normal distribution with mean
vector $\boldsymbol{\mu}$ and covariance matrix $\boldsymbol{\Sigma}$.



II. Compressive Sensing Model 
We consider the standard measurement model given by: 

$$\mathbf{y} = \boldsymbol{\Phi}\mathbf{f} + \mathbf{n}, \tag{1}$$

where $\mathbf{y} \in \mathbb{R}^{m}$ is the measurement signal vector, $\mathbf{f} \in \mathbb{R}^{n}$ is the original signal vector, $\mathbf{n} \sim \mathcal{N}(\mathbf{0}, \sigma^2 \mathbf{I}_m)$
is a zero-mean white Gaussian noise vector, and $\boldsymbol{\Phi} \in \mathbb{R}^{m \times n}$ (with $m < n$) is the sensing matrix.
We assume that the original signal is sparse in some basis, i.e.,

$$\mathbf{f} = \boldsymbol{\Psi}\mathbf{x}, \tag{2}$$

where $\boldsymbol{\Psi} \in \mathbb{R}^{n \times \tilde{n}}$ ($\tilde{n} \ge n$) is a matrix that represents the sparsifying basis, e.g., an orthonormal or
overcomplete dictionary, and $\mathbf{x} \in \mathbb{R}^{\tilde{n}}$ is a sparse representation of $\mathbf{f} \in \mathbb{R}^{n}$, i.e., $\|\mathbf{x}\|_0 \le s \ll \tilde{n}$. Then we
can rewrite the measurement model as

$$\mathbf{y} = \boldsymbol{\Phi}\boldsymbol{\Psi}\mathbf{x} + \mathbf{n} = \mathbf{A}\mathbf{x} + \mathbf{n}, \tag{3}$$

where $\mathbf{A} = \boldsymbol{\Phi}\boldsymbol{\Psi} \in \mathbb{R}^{m \times \tilde{n}}$ represents the equivalent sensing matrix. For modeling the sparse sources,
we assume i) the distinct support patterns of the same sparsity level occur with equal probability in the
sparse representation of the original signal, i.e., $\Pr[\mathcal{J}^c] = P_c$ for every support $\mathcal{J}^c \subset \{1, \ldots, \tilde{n}\}$ with
cardinality $c$ ($c = 1, \ldots, s$), of which there are $\binom{\tilde{n}}{c}$, and $\sum_{c=1}^{s} \binom{\tilde{n}}{c} P_c = 1$; ii) $\mathbb{E}_{\mathbf{x}}(\mathbf{x}\mathbf{x}^T) = \mathbf{I}_{\tilde{n}}$. Note
that these assumptions can be satisfied by a signal model akin to the widely used Bernoulli-Gaussian
model [28]-[34]. In particular, one constrains the cardinality of the support patterns to be at most $s$,
rather than $\tilde{n}$; one also constrains the probability of the support patterns to obey $\sum_{c=1}^{s} \binom{\tilde{n}}{c} P_c = 1$ rather
than a binomial distribution as in the Bernoulli-Gaussian model.
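
The signal and measurement model above can be simulated with a short NumPy sketch. It is illustrative only: the function names `sample_sparse_vector` and `measure` are hypothetical, the support cardinality is fixed to exactly s (one admissible choice of the probabilities P_c), and the nonzero entries are assumed Gaussian with variance ñ/s so that the identity autocorrelation assumption holds. The text itself does not fix the distribution of the nonzero entries.

```python
import numpy as np

def sample_sparse_vector(n_tilde, s, rng):
    """Draw x with a uniformly random support of size s and Gaussian nonzeros.

    Assumption: nonzeros have variance n_tilde / s, which gives E[x x^T] = I.
    """
    x = np.zeros(n_tilde)
    support = rng.choice(n_tilde, size=s, replace=False)
    x[support] = rng.standard_normal(s) * np.sqrt(n_tilde / s)
    return x, np.sort(support)

def measure(Phi, Psi, x, sigma, rng):
    """Return y = Phi @ Psi @ x + n with n ~ N(0, sigma^2 I_m)."""
    m = Phi.shape[0]
    return Phi @ Psi @ x + sigma * rng.standard_normal(m)
```

Each coordinate is nonzero with probability s/ñ and then has variance ñ/s, so its second moment is 1, and distinct coordinates are uncorrelated because the nonzeros are zero-mean and independent.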

To recover the sparse signal representation x from the measurement vector y, one can resort to the 
optimization problem: 

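$$\min_{\mathbf{x}} \|\mathbf{x}\|_1 \quad \text{s.t.} \quad \|\mathbf{A}\mathbf{x} - \mathbf{y}\|_2 \le \epsilon, \tag{4}$$

where $\epsilon$ is an estimate of the noise level. This program is also known as the basis pursuit de-noise
(BPDN) [35].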

It has been established in [36] that the now well-known RIP, which was introduced by Candès and
Tao [37], provides a sufficient condition for exact or near exact recovery of a sparse signal representation
$\mathbf{x}$ from the measurement vector $\mathbf{y}$ via the $\ell_1$ minimization in (4).

Definition 1: A matrix $\mathbf{A} \in \mathbb{R}^{m \times \tilde{n}}$ satisfies the RIP of order $s$ with restricted isometry constant
(RIC) $\delta_s \in (0, 1)$ if $\delta_s$ is the smallest number such that

$$(1 - \delta_s)\|\mathbf{x}\|_2^2 \le \|\mathbf{A}\mathbf{x}\|_2^2 \le (1 + \delta_s)\|\mathbf{x}\|_2^2 \tag{5}$$

holds for all $\mathbf{x}$ with $\|\mathbf{x}\|_0 \le s$.
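
To make the definition concrete, and to illustrate why the RIP is hard to verify, the RIC of a small matrix can be computed by brute force. The sketch below is an illustration under stated assumptions rather than a practical tool: the function name `restricted_isometry_constant` is hypothetical, the squared ℓ2-norm form of the RIP inequality is assumed, and the loop visits every support of size s, so the cost grows combinatorially with the number of columns.

```python
import itertools
import numpy as np

def restricted_isometry_constant(A, s):
    """Brute-force RIC of order s (squared-norm convention assumed)."""
    A = np.asarray(A, dtype=float)
    n_tilde = A.shape[1]
    delta = 0.0
    # Supports of size exactly s suffice: smaller supports give Gram
    # submatrices whose eigenvalues interlace those of the larger ones.
    for support in itertools.combinations(range(n_tilde), s):
        cols = A[:, list(support)]
        eig = np.linalg.eigvalsh(cols.T @ cols)   # ascending eigenvalues
        delta = max(delta, 1.0 - eig[0], eig[-1] - 1.0)
    return delta
```

Even for a modest 40 x 80 matrix and s = 4 this enumerates over 1.5 million supports, which is why easier proxies such as the mutual coherence are attractive.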
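Theorem 1: The solution $\mathbf{x}^*$ of (4) obeys

$$\|\mathbf{x}^* - \mathbf{x}\|_2 \le c_1 s^{-1/2} \|\mathbf{x} - \mathbf{x}_s\|_1 + c_2 \epsilon, \tag{6}$$

where $c_1 = \frac{2\left(1 + (\sqrt{2} - 1)\delta_{2s}\right)}{1 - (\sqrt{2} + 1)\delta_{2s}}$, $c_2 = \frac{4\sqrt{1 + \delta_{2s}}}{1 - (\sqrt{2} + 1)\delta_{2s}}$, $\mathbf{x}_s$ is an approximation of $\mathbf{x}$ with all but the $s$-largest entries
set to zero, and $\delta_{2s}$ is the RIC of order $2s$ of matrix $\mathbf{A}$.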

This theorem states that the reconstructed signal representation $\mathbf{x}^*$ is a good approximation to the
original signal representation $\mathbf{x}$. In addition, for the noiseless case, any sparse representation $\mathbf{x}$ with
support size no larger than $s$ can be exactly recovered by $\ell_1$ minimization if the RIC satisfies $\delta_{2s} < \sqrt{2} - 1$.
Therefore, it follows that the RIP acts as a proxy to the quality of a sensing matrix. Note that the RIP
is a sufficient condition for successful reconstruction but it may be too strict. It has been observed that
signals with sparse representations can be reconstructed very well even though the sensing matrices have
not been proven to satisfy the RIP [22].

Another way to evaluate a sensing matrix, which is not as computationally intractable as the RIP, is
via the mutual coherence of the matrix $\mathbf{A}$, given by [21]:

$$\mu = \max_{i \ne j} |\mathbf{a}_i^T \mathbf{a}_j|. \tag{7}$$

Donoho, Elad and Temlyakov [21] demonstrated that the error of the solution to (4) is bounded if
$\mu < \frac{1}{4s - 1}$. Therefore, mutual coherence can also be used to measure the quality of a sensing matrix. For
example, various sensing matrix design approaches in the literature, such as Elad's method [16],
Duarte-Carvajalino and Sapiro's method [17], and Xu et al.'s method [18] are inherently mutual coherence based
approaches.
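
Unlike the RIC, the mutual coherence is cheap to evaluate. A minimal NumPy sketch follows; the function names are hypothetical, and the columns are normalized to unit ℓ2 norm before the inner products are taken, which is an assumption made here (it requires nonzero columns). The second helper checks the sufficient condition μ < 1/(4s - 1) attributed to [21].

```python
import numpy as np

def mutual_coherence(A):
    """Largest absolute inner product between distinct unit-normalized columns."""
    A = np.asarray(A, dtype=float)
    A_unit = A / np.linalg.norm(A, axis=0)   # assumes no all-zero column
    G = A_unit.T @ A_unit                    # Gram matrix of the columns
    np.fill_diagonal(G, 0.0)                 # discard the i == j terms
    return float(np.max(np.abs(G)))

def satisfies_coherence_condition(A, s):
    """True when mu < 1 / (4 s - 1) for sparsity level s."""
    return mutual_coherence(A) < 1.0 / (4 * s - 1)
```

The cost is a single matrix product, in contrast to the combinatorial search needed for the RIC.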

III. Design Rationale 

We now provide a rationale for the proposed novel sensing matrix designs. The ultimate goal of the 
sensing matrix designs relates to the minimization of the MSE in estimating x from y, given by 

$$\mathrm{MSE}(\boldsymbol{\Phi}) = \mathbb{E}_{\mathbf{x},\mathbf{n}}\left(\|F(\boldsymbol{\Phi}\boldsymbol{\Psi}\mathbf{x} + \mathbf{n}) - \mathbf{x}\|_2^2\right), \tag{8}$$

where $F(\cdot)$ denotes an estimator, subject to appropriate constraints (e.g., sensing energy cost).¹

The derivation of such a sensing matrix design is very difficult though, because the average MSE in
(8) depends upon the actual estimator. Consequently, to avoid the analysis of a single or several practical
sparse recovery algorithms such as the BPDN, the Dantzig selector, or the OMP, we capitalize - as
in [24] - on the well-known oracle estimator that performs ideal least squares (LS) estimation based on
prior knowledge of the sparse vector support $\mathcal{J} \subset \{1, \ldots, \tilde{n}\}$ [23]. The rationale of this approach is
supported by the fact that the MSE of this oracle LS estimator coincides with the unbiased Cramér-Rao
bound (CRB) for exactly s-sparse deterministic vectors [41], so that it represents the best achievable
performance for any unbiased estimator. Equally important, this approach is also supported by the fact
that the oracle estimator MSE performance acts as a performance benchmark for the key sparse recovery
algorithms. For example, Ben-Haim, Eldar and Elad [42] demonstrate both theoretically and numerically
that the BPDN, the Dantzig selector, the OMP and thresholding algorithms all achieve performances that
are proportional to the oracle estimator MSE.

The oracle estimator MSE incurred in the estimation of a sparse deterministic vector $\mathbf{x}$ in the presence
of a standard Gaussian noise vector $\mathbf{n}$, according to the model in (1), is given by [23]:²



$$\mathrm{MSE}^{\mathrm{oracle}}(\mathbf{A}, \mathbf{x}) = \mathbb{E}_{\mathbf{n}}\left(\|F^{\mathrm{oracle}}(\mathbf{A}\mathbf{x} + \mathbf{n}) - \mathbf{x}\|_2^2\right) = \sigma^2 \,\mathrm{Tr}\left(\left(\mathbf{E}_{\mathcal{J}}^T \mathbf{A}^T \mathbf{A} \mathbf{E}_{\mathcal{J}}\right)^{-1}\right). \tag{9}$$

¹We would also like to add that one could argue that it is preferable to consider the MSE associated with the estimation of $\mathbf{f}$ (the
actual signal) from $\mathbf{y}$ rather than the MSE associated with the estimation of $\mathbf{x}$ (the signal sparse representation) from $\mathbf{y}$. We use the
more tractable MSE associated with the estimation of $\mathbf{x}$ from $\mathbf{y}$ because: 1) it can be shown that the MSE performance associated
with the (oracle) estimation of $\mathbf{x}$ from $\mathbf{y}$ upper bounds in general the MSE performance associated with the (oracle) estimation of
$\mathbf{f}$ from $\mathbf{y}$. In particular, for an orthogonal dictionary, where $\boldsymbol{\Psi}$ is an orthogonal matrix, $\|\mathbf{f} - \mathbf{f}^*\|_2^2 = \|\boldsymbol{\Psi}\mathbf{x} - \boldsymbol{\Psi}\mathbf{x}^*\|_2^2 = \|\mathbf{x} - \mathbf{x}^*\|_2^2$,
where $\mathbf{x}^*$ denotes the (oracle) estimate of $\mathbf{x}$ and $\mathbf{f}^* = \boldsymbol{\Psi}\mathbf{x}^*$ denotes the (oracle) estimate of $\mathbf{f}$; for an overcomplete dictionary,
where $\boldsymbol{\Psi}$ is not an orthogonal matrix, $\|\mathbf{f} - \mathbf{f}^*\|_2^2 = \|\boldsymbol{\Psi}\mathbf{x} - \boldsymbol{\Psi}\mathbf{x}^*\|_2^2 \le \lambda_{\max}^2(\boldsymbol{\Psi}) \|\mathbf{x} - \mathbf{x}^*\|_2^2$, where $\lambda_{\max}(\boldsymbol{\Psi})$ is the largest
singular value of $\boldsymbol{\Psi}$; 2) it is also often desirable to manipulate or process the information content of signals in the sparse
representation domain rather than the original observation domain, such as in feature extraction, pattern classification and blind
source separation [38]-[40]. Therefore, the MSE performance associated with the estimation of $\mathbf{x}$ would be more appropriate
than the MSE performance associated with the estimation of $\mathbf{f}$ for such applications.

²Note that various works have adopted the oracle minimum MSE (MMSE) estimator in lieu of the oracle LS one in order to
obtain a superior MMSE estimate [43]-[45]. The fact that we assume a signal model that does not specify the exact distribution
of the sparse signal conditioned on the support - in contrast to [43]-[45], which take the distribution of the sparse signal conditioned
on the support to be multivariate Gaussian - prevents us from exploiting this more powerful estimator. This approach however
instils our projections design framework with more generality.

Consequently, the average value of the oracle estimator MSE incurred in the estimation of a sparse
random vector $\mathbf{x}$ in the presence of the Gaussian noise vector $\mathbf{n}$ is given by:

$$\mathrm{MSE}^{\mathrm{oracle}}(\mathbf{A}) = \sigma^2 \,\mathbb{E}_{\mathcal{J}}\left(\mathrm{Tr}\left(\left(\mathbf{E}_{\mathcal{J}}^T \mathbf{A}^T \mathbf{A} \mathbf{E}_{\mathcal{J}}\right)^{-1}\right)\right). \tag{10}$$
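
The oracle MSE is straightforward to evaluate numerically, since multiplying A by E_J simply selects the columns of A indexed by the support. The sketch below is a hedged illustration: the function names are hypothetical, the supports are assumed to have a fixed cardinality s drawn uniformly at random, and the average over supports is estimated by Monte Carlo sampling rather than by full enumeration.

```python
import numpy as np

def oracle_mse(A, support, sigma):
    """sigma^2 * Tr((A_J^T A_J)^{-1}) for the columns A_J selected by `support`."""
    A_J = np.asarray(A, dtype=float)[:, support]   # same as A @ E_J
    return sigma ** 2 * np.trace(np.linalg.inv(A_J.T @ A_J))

def average_oracle_mse(A, s, sigma, trials, rng):
    """Monte Carlo estimate of the support-averaged oracle MSE."""
    n_tilde = np.asarray(A).shape[1]
    total = 0.0
    for _ in range(trials):
        support = rng.choice(n_tilde, size=s, replace=False)
        total += oracle_mse(A, support, sigma)
    return total / trials
```

For example, `average_oracle_mse(A, 4, 0.1, 1000, np.random.default_rng(0))` can be used to compare two candidate equivalent sensing matrices that share the same trace of A^T A.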



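We define the coherence matrix of the equivalent sensing matrix as $\mathbf{Q} = \mathbf{A}^T\mathbf{A} = \boldsymbol{\Psi}^T\boldsymbol{\Phi}^T\boldsymbol{\Phi}\boldsymbol{\Psi}$. We
now pose the optimization problem:

$$\begin{aligned} \min_{\mathbf{Q}} \quad & \mathbb{E}_{\mathcal{J}}\left(\mathrm{Tr}\left(\left(\mathbf{E}_{\mathcal{J}}^T \mathbf{Q} \mathbf{E}_{\mathcal{J}}\right)^{-1}\right)\right) \\ \text{s.t.} \quad & \mathbf{Q} \succeq 0, \\ & \mathrm{Tr}(\mathbf{Q}) = m, \\ & \mathrm{rank}(\mathbf{Q}) \le m. \end{aligned} \tag{11}$$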

It is relevant to reflect further on the rationale of this optimization problem. This optimization problem 
defines the coherence matrix of the equivalent sensing matrix - up to a rotation - that minimizes the 
average value of the oracle MSE subject to appropriate constraints: these include the obvious positive 
semi-definite and rank constraints on the coherence matrix and - at the heart of the novelty of the approach 
- a trace constraint on the coherence matrix that acts as a proxy to the sensed energy. 

In the noiseless case [16]-[18], it is not common to place a constraint on the sensed energy because
recovery is immune to the scaling of the sensing matrix; instead, it is only common to seek sensing
matrices that exhibit adequate structure (e.g., [16] uses t-averaged mutual coherence, [17] uses an
equivalent sensing matrix whose Gram matrix is similar to an identity matrix, and [18] uses an equivalent
sensing matrix which is close to an equiangular tight frame).

In contrast, in the noisy case it is important to place a constraint on the sensed energy because recovery 
is affected both by the sensing matrix structure and immunity to noise. Therefore, the main features of 
our formulation include: 

1) The optimization problem defines equivalent sensing matrices with good structure and immunity 
to noise. 

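2) The formulation is such that the sensed energy is directly proportional to the number of measurements.
In fact, the sensed energy is given by:

$$\mathbb{E}_{\mathbf{x}}\left(\mathrm{Tr}\left(\boldsymbol{\Phi}\boldsymbol{\Psi}\mathbf{x}\mathbf{x}^T\boldsymbol{\Psi}^T\boldsymbol{\Phi}^T\right)\right) = \mathrm{Tr}\left(\boldsymbol{\Phi}\boldsymbol{\Psi}\,\mathbb{E}_{\mathbf{x}}\left(\mathbf{x}\mathbf{x}^T\right)\boldsymbol{\Psi}^T\boldsymbol{\Phi}^T\right) = \mathrm{Tr}\left(\boldsymbol{\Phi}\boldsymbol{\Psi}\boldsymbol{\Psi}^T\boldsymbol{\Phi}^T\right) = m, \tag{12}$$

where we have used the fact that $\mathbb{E}_{\mathbf{x}}(\mathbf{x}\mathbf{x}^T) = \mathbf{I}_{\tilde{n}}$. Note that a modification of the constant of
proportionality, which is equal to 1 here, only scales the solution to the optimization problem (11).

3) The formulation is also such that the sensed SNR

$$\frac{\mathbb{E}_{\mathbf{x}}\left(\mathrm{Tr}\left(\boldsymbol{\Phi}\boldsymbol{\Psi}\mathbf{x}\mathbf{x}^T\boldsymbol{\Psi}^T\boldsymbol{\Phi}^T\right)\right)}{\mathbb{E}_{\mathbf{n}}\left(\mathrm{Tr}\left(\mathbf{n}\mathbf{n}^T\right)\right)} = \frac{1}{\sigma^2} \tag{13}$$

does not depend on $m$, $n$ or $\tilde{n}$.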
We will see that in the presence of noise some of the "noiseless" sensing matrix designs in the literature
can yield very poor recovery performance (see Section V). This is due to the fact that upon the normalization
of the sensing matrix so that it conforms to a specific sensing cost constraint, the structural
properties of the designs are offset by the poor noise immunity of the designs. The optimization problem
formulation in (11) thus aims to attain a compromise between the structural and the noise immunity
properties of the sensing matrix.³

The optimization problem (11) is non-convex owing to the rank constraint, and so is very difficult to
solve. Therefore, we adopt an approach akin to that in [24]: i) we first consider a convex relaxation of
(11) by ignoring the rank constraint; and ii) we then consider the feasible solution that is closest to the
solution to the relaxed problem. This procedure produces a sub-optimal equivalent sensing matrix, but
extensive simulation results demonstrate that this design outperforms various other designs.

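Proposition 1: The solution of the optimization problem:

$$\begin{aligned} \min_{\mathbf{Q}} \quad & \mathbb{E}_{\mathcal{J}}\left(\mathrm{Tr}\left(\left(\mathbf{E}_{\mathcal{J}}^T \mathbf{Q} \mathbf{E}_{\mathcal{J}}\right)^{-1}\right)\right) \\ \text{s.t.} \quad & \mathbf{Q} \succeq 0, \\ & \mathrm{Tr}(\mathbf{Q}) = m, \end{aligned} \tag{14}$$

which represents a convex relaxation of the original optimization problem in (11), is the $\tilde{n} \times \tilde{n}$ matrix
$\mathbf{Q} = \frac{m}{\tilde{n}}\mathbf{I}_{\tilde{n}}$.

Proof: See Appendix A. ■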
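
As a quick sanity check on the relaxed problem, its objective can be evaluated in closed form at the scaled identity. This worked evaluation is added for illustration; it assumes that every support has the same cardinality s and takes the relaxed solution to be the scaled identity with trace m:

```latex
\begin{align*}
\mathbf{E}_{\mathcal{J}}^T \mathbf{Q}\, \mathbf{E}_{\mathcal{J}}
  &= \tfrac{m}{\tilde{n}}\, \mathbf{E}_{\mathcal{J}}^T \mathbf{E}_{\mathcal{J}}
   = \tfrac{m}{\tilde{n}}\, \mathbf{I}_s, \\
\mathrm{Tr}\Big(\big(\mathbf{E}_{\mathcal{J}}^T \mathbf{Q}\, \mathbf{E}_{\mathcal{J}}\big)^{-1}\Big)
  &= \tfrac{\tilde{n}}{m}\, \mathrm{Tr}(\mathbf{I}_s)
   = \frac{s\,\tilde{n}}{m}.
\end{align*}
```

The value is the same for every support, so the expectation over the support equals the same quantity, and the corresponding average oracle MSE would be the noise variance multiplied by s ñ / m. With fewer measurements than dictionary atoms this value is a lower bound rather than an attainable figure, because the scaled identity has full rank.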

³Note that this optimization problem places a cost on the equivalent sensing matrix $\mathbf{A} = \boldsymbol{\Phi}\boldsymbol{\Psi}$, which translates into a constraint
on the energy given to the estimator, rather than a cost on the sensing matrix $\boldsymbol{\Phi}$, which translates into a constraint on the sensing
energy. We recognize that a sensing energy cost is often more appropriate, but this is difficult to analyze in general. Therefore,
our approach when the signal is sparse in a general overcomplete dictionary departs from that when the signal is sparse in an
orthonormal dictionary [24]. In particular, we only incorporate the effect of sensing energy constraints into the design framework
in Section IV.

It is evident that the solution to the convex relaxation of the original optimization problem is not
feasible, because $\mathrm{rank}\left(\frac{m}{\tilde{n}}\mathbf{I}_{\tilde{n}}\right) = \tilde{n} > m$. Therefore, we now propose to determine the $m \times \tilde{n}$ matrix $\mathbf{A}$
whose $\tilde{n} \times \tilde{n}$ coherence matrix $\mathbf{Q} = \mathbf{A}^T\mathbf{A}$ is closest to the $\tilde{n} \times \tilde{n}$ matrix $\frac{m}{\tilde{n}}\mathbf{I}_{\tilde{n}}$.

Proposition 2: The solution of the optimization problem:

$$\min_{\mathbf{A}} \left\|\mathbf{A}^T\mathbf{A} - \frac{m}{\tilde{n}}\mathbf{I}_{\tilde{n}}\right\|_F^2 \quad \text{s.t.} \quad \mathrm{Tr}\left(\mathbf{A}^T\mathbf{A}\right) = m, \tag{15}$$

is the $m \times \tilde{n}$ Parseval tight frame.

Proof: See Appendix B. ■
A frame in a finite-dimensional real space can be seen as a matrix $\mathbf{A} \in \mathbb{R}^{m \times \tilde{n}}$ such that for any vector
$\mathbf{z} \in \mathbb{R}^{m}$,

$$a\|\mathbf{z}\|_2^2 \le \|\mathbf{A}^T\mathbf{z}\|_2^2 \le b\|\mathbf{z}\|_2^2, \tag{16}$$

where $a > 0$ and $b > 0$ are known as the frame bounds. Tight frames are a class of frames with equal
frame bounds, i.e., $a = b$. A tight frame whose columns have unit $\ell_2$ norm is called a unit norm tight
frame. A tight frame whose frame bound is equal to 1 is called a Parseval tight frame. Note that any
tight frame can be scaled by multiplying by $1/\sqrt{a}$ so that the frame bound becomes equal to 1.
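
These frame definitions can be checked numerically. The sketch below is an illustration with hypothetical function names: it builds one possible Parseval tight frame by transposing the orthonormal factor of a QR decomposition (any matrix with orthonormal rows would do) and reads the tightest frame bounds off the extreme eigenvalues of A A^T.

```python
import numpy as np

def random_parseval_frame(m, n_tilde, rng):
    """m x n_tilde matrix with orthonormal rows, so that A @ A.T = I_m."""
    Q, _ = np.linalg.qr(rng.standard_normal((n_tilde, m)))  # Q has orthonormal columns
    return Q.T

def frame_bounds(A):
    """Tightest frame bounds (a, b): smallest and largest eigenvalue of A @ A.T."""
    eig = np.linalg.eigvalsh(A @ A.T)   # ascending order
    return float(eig[0]), float(eig[-1])
```

For a frame built this way both bounds equal 1 and the trace of A^T A equals m, which matches the trace constraint used in the optimization problems of this section.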

Therefore, the value of the constraint of (15) leads to a frame with a frame bound being equal to 1,
and thus results in a Parseval tight frame. By scaling the value of $\mathrm{Tr}(\mathbf{A}^T\mathbf{A})$ in the constraint, which in
fact alters the target sensing SNR in (13), it is clear that the solution of the optimization problem (15)
is still a tight frame. Therefore, we can deduce that the tight frame represents a good equivalent sensing
matrix design, in the sense that, among all equivalent sensing matrices that conform to the target sensing
SNR, a tight frame is likely to produce a good MSE performance. Appendix C explores another facet of
tight frames, including the relationship of a unit-norm tight frame to StRIP.

Note that an alternative way to prove Proposition 2, which has been motivated by the optimization
problem put forth by Duarte-Carvajalino and Sapiro [17], is also provided in [19]. The current problem
differs from the problems in [17], [19] since our optimization approach is based on a metric with
operational significance, the MSE, whereas the optimization approach in [17], [19] is based on mutual
coherence.

IV. Novel Sensing Matrix Design Approaches 

We now build upon the previous analysis, which suggests that $\mathbf{A} = \boldsymbol{\Phi}\boldsymbol{\Psi}$ ought to be close to a
Parseval tight frame, to propose two sensing matrix designs for the compressive sensing model of Section II. In
particular, in view of the fact that it is usual to place a constraint on the sensing energy cost $\|\boldsymbol{\Phi}\|_F^2 = n$,
the design approaches strike a balance between two objectives: i) guaranteeing that the equivalent sensing
matrix $\mathbf{A} = \boldsymbol{\Phi}\boldsymbol{\Psi}$ is as close as possible to a Parseval tight frame; and ii) guaranteeing that the sensing
cost $\|\boldsymbol{\Phi}\|_F^2$ is as small as possible. For example, for two different sensing matrices $\boldsymbol{\Phi}'$ and $\boldsymbol{\Phi}''$ such
that $\boldsymbol{\Phi}'\boldsymbol{\Psi} = \boldsymbol{\Phi}''\boldsymbol{\Psi}$ is equal or close (e.g., in Frobenius norm sense) to some Parseval tight frame and
$\|\boldsymbol{\Phi}'\|_F^2 < \|\boldsymbol{\Phi}''\|_F^2$, it may be preferable to use $\boldsymbol{\Phi}'$ instead of $\boldsymbol{\Phi}''$ in the compressive sensing model.
In fact, the normalization

$$\boldsymbol{\Phi}' \leftarrow \frac{\sqrt{n}}{\|\boldsymbol{\Phi}'\|_F}\,\boldsymbol{\Phi}' \tag{17}$$

and

$$\boldsymbol{\Phi}'' \leftarrow \frac{\sqrt{n}}{\|\boldsymbol{\Phi}''\|_F}\,\boldsymbol{\Phi}'' \tag{18}$$

then ensures that

$$\|\boldsymbol{\Phi}'\boldsymbol{\Psi}\|_F^2 > \|\boldsymbol{\Phi}''\boldsymbol{\Psi}\|_F^2 \tag{19}$$

and - via the previous analysis - eventually

$$\mathrm{MSE}^{\mathrm{oracle}}(\boldsymbol{\Phi}'\boldsymbol{\Psi}) < \mathrm{MSE}^{\mathrm{oracle}}(\boldsymbol{\Phi}''\boldsymbol{\Psi}). \tag{20}$$

We note that this design approach, which is applicable to the noisy setting, is fundamentally different
from the approaches in [16]-[18], which in contrast apply to the noiseless case. In particular, our design
considers the sensing energy cost whereas the designs in [16]-[18] do not. We will reveal the effect
of taking into account the sensing energy constraint when we re-normalize the designs in [16]-[18], by
showing the radically different performances in the presence of noise.



A. Design Approach 1 

We now consider the first sensing matrix design approach, which explicitly balances
the objective of guaranteeing that the equivalent sensing matrix is as close as possible to a Parseval tight
frame against the objective of guaranteeing that the sensing energy cost is as small as possible. In
particular, we pose the design problem:

$$\min_{\boldsymbol{\Phi}} \ \|\boldsymbol{\Phi}\boldsymbol{\Psi} - \mathbf{B}\|_F^2 + \alpha\|\boldsymbol{\Phi}\|_F^2, \tag{21}$$

where $\mathbf{B} \in \mathbb{R}^{m \times \tilde{n}}$ is a specific target Parseval tight frame and $\alpha \ge 0$ is a specific scalar. The solution to
the design problem is:

$$\boldsymbol{\Phi} = \mathbf{B}\boldsymbol{\Psi}^T\left(\boldsymbol{\Psi}\boldsymbol{\Psi}^T + \alpha\mathbf{I}_n\right)^{-1}. \tag{22}$$
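In turn, the sensing matrix design, which is consistent with the sensing cost constraint $\|\boldsymbol{\Phi}\|_F^2 = n$, is:

$$\boldsymbol{\Phi} = \frac{\sqrt{n}\,\mathbf{B}\boldsymbol{\Psi}^T\left(\boldsymbol{\Psi}\boldsymbol{\Psi}^T + \alpha\mathbf{I}_n\right)^{-1}}{\left\|\mathbf{B}\boldsymbol{\Psi}^T\left(\boldsymbol{\Psi}\boldsymbol{\Psi}^T + \alpha\mathbf{I}_n\right)^{-1}\right\|_F}. \tag{23}$$

We note that the scalar $\alpha$ controls the weight of the energy penalty on the sensing matrix. If the penalty
is not considered, i.e., $\alpha = 0$, we have the sensing matrix design $\boldsymbol{\Phi} = \sqrt{n}\,\mathbf{B}\boldsymbol{\Psi}^T(\boldsymbol{\Psi}\boldsymbol{\Psi}^T)^{-1} / \|\mathbf{B}\boldsymbol{\Psi}^T(\boldsymbol{\Psi}\boldsymbol{\Psi}^T)^{-1}\|_F$. In contrast, for
a very high penalty, i.e., $\alpha \to +\infty$, we have the design $\boldsymbol{\Phi} = \sqrt{n}\,\mathbf{B}\boldsymbol{\Psi}^T / \|\mathbf{B}\boldsymbol{\Psi}^T\|_F$. In both cases, i.e., $\alpha = 0$ or
$\alpha \to +\infty$, the sensing matrix $\boldsymbol{\Phi}$ turns out to be a unit norm tight frame if the basis $\boldsymbol{\Psi}$ is an orthonormal
matrix and the design target $\mathbf{B}$ is a tight frame with equal column norm, i.e., a scaled unit norm tight
frame. We also note that, as will be shown later, the performance gain is greatly affected by the parameter
$\alpha$. In particular, one needs to use some empirical knowledge in order to set a suitable value for $\alpha$. We
next propose a sensing matrix design approach that does not contain any adjustable parameters.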
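
Design approach 1 amounts to a regularized least-squares solve followed by a rescaling. The following NumPy sketch is a hedged illustration of that closed form, with hypothetical function names; it assumes the penalized objective is the squared Frobenius distance to the target B plus alpha times the squared Frobenius norm of the sensing matrix, and it solves a symmetric linear system instead of forming an explicit inverse.

```python
import numpy as np

def design1_unnormalized(Psi, B, alpha):
    """Closed-form minimizer of ||Phi Psi - B||_F^2 + alpha ||Phi||_F^2."""
    n = Psi.shape[0]
    M = Psi @ Psi.T + alpha * np.eye(n)        # symmetric n x n matrix
    # Phi = B Psi^T M^{-1}; since M is symmetric, solve M Phi^T = Psi B^T.
    return np.linalg.solve(M, Psi @ B.T).T

def design1(Psi, B, alpha):
    """Rescale the minimizer so that ||Phi||_F^2 = n."""
    Phi = design1_unnormalized(Psi, B, alpha)
    return np.sqrt(Psi.shape[0]) * Phi / np.linalg.norm(Phi)  # Frobenius norm
```

With alpha = 0 the solve requires the dictionary to have full row rank; any positive alpha makes the system matrix positive definite.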



B. Design Approach 2 

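We now consider the second sensing matrix design approach, where the objective is to determine the
matrix design with the lowest sensing energy cost that is consistent with the fact that the equivalent
sensing matrix ought to be a Parseval tight frame. It will be shown that the ensuing design is instilled
with operational significance, akin to the design in [20]. We pose the design problem:

$$\min_{\boldsymbol{\Phi}} \ \|\boldsymbol{\Phi}\|_F^2 \quad \text{s.t.} \quad \boldsymbol{\Phi}\boldsymbol{\Psi}\boldsymbol{\Psi}^T\boldsymbol{\Phi}^T = \mathbf{I}_m. \tag{24}$$

The following Proposition defines the solution to this optimization problem. We use the singular
value decomposition (SVD) of the dictionary $\boldsymbol{\Psi} = \mathbf{U}_{\Psi}\boldsymbol{\Lambda}_{\Psi}\mathbf{V}_{\Psi}^T$, where $\mathbf{U}_{\Psi} \in \mathbb{R}^{n \times n}$ and $\mathbf{V}_{\Psi} \in \mathbb{R}^{\tilde{n} \times \tilde{n}}$ are
orthonormal matrices, and $\boldsymbol{\Lambda}_{\Psi} \in \mathbb{R}^{n \times \tilde{n}}$ is a matrix whose main diagonal entries ($\lambda_1^{\Psi} \ge \lambda_2^{\Psi} \ge \ldots \ge \lambda_n^{\Psi} \ge 0$)
are the singular values of $\boldsymbol{\Psi}$ and the other entries are zeros. We also use the SVD of the sensing matrix
$\boldsymbol{\Phi} = \mathbf{U}_{\Phi}\boldsymbol{\Lambda}_{\Phi}\mathbf{V}_{\Phi}^T$, where $\mathbf{U}_{\Phi} \in \mathbb{R}^{m \times m}$ and $\mathbf{V}_{\Phi} \in \mathbb{R}^{n \times n}$ are orthonormal matrices, and $\boldsymbol{\Lambda}_{\Phi} \in \mathbb{R}^{m \times n}$ is a
matrix whose main diagonal entries ($\lambda_1^{\Phi} \ge \lambda_2^{\Phi} \ge \ldots \ge \lambda_m^{\Phi} \ge 0$) are the singular values of $\boldsymbol{\Phi}$ and the
other entries are zeros.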

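Proposition 3: A sensing matrix that solves the optimization problem in (24) is given by

$$\boldsymbol{\Phi} = \mathbf{U}_{\Phi}\boldsymbol{\Lambda}_{\Phi}\mathbf{J}_n\mathbf{U}_{\Psi}^T, \tag{25}$$

where $\mathbf{U}_{\Phi}$ is an arbitrary orthonormal matrix and $\boldsymbol{\Lambda}_{\Phi} = \left[\mathrm{Diag}\left(\frac{1}{\lambda_m^{\Psi}}, \ldots, \frac{1}{\lambda_1^{\Psi}}\right) \ \mathbf{O}_{m \times (n-m)}\right]$.

Proof: Consider the SVD of the dictionary $\boldsymbol{\Psi} = \mathbf{U}_{\Psi}\boldsymbol{\Lambda}_{\Psi}\mathbf{V}_{\Psi}^T$ and the sensing matrix $\boldsymbol{\Phi} = \mathbf{U}_{\Phi}\boldsymbol{\Lambda}_{\Phi}\mathbf{V}_{\Phi}^T$.
Then the equivalent sensing matrix can be expressed as:

$$\boldsymbol{\Phi}\boldsymbol{\Psi} = \mathbf{U}_{\Phi}\boldsymbol{\Lambda}_{\Phi}\mathbf{V}_{\Phi}^T\mathbf{U}_{\Psi}\boldsymbol{\Lambda}_{\Psi}\mathbf{V}_{\Psi}^T, \tag{26}$$

and so the Parseval tight frame constraint in (24) can also be expressed as:

$$\boldsymbol{\Lambda}_{\Phi}\mathbf{V}_{\Phi}^T\mathbf{U}_{\Psi}\boldsymbol{\Lambda}_{\Psi}\boldsymbol{\Lambda}_{\Psi}^T\mathbf{U}_{\Psi}^T\mathbf{V}_{\Phi}\boldsymbol{\Lambda}_{\Phi}^T = \mathbf{I}_m. \tag{27}$$

To satisfy the Parseval tight frame condition in (27), it is clear that $m$ columns of $\mathbf{V}_{\Phi}$ have to correspond
to $m$ columns of $\mathbf{U}_{\Psi}$. Since the remaining $n - m$ columns of $\mathbf{V}_{\Phi}$ do not affect the Parseval tight frame
condition at all, we take without any loss of generality $\mathbf{V}_{\Phi} = \mathbf{U}_{\Psi}\boldsymbol{\Pi}$, where $\boldsymbol{\Pi} \in \mathbb{R}^{n \times n}$ is a
permutation matrix. Therefore, we can now rewrite the optimization problem as follows:

$$\begin{aligned} \min_{\boldsymbol{\Lambda}_{\Phi},\,\boldsymbol{\Pi}} \quad & \|\boldsymbol{\Lambda}_{\Phi}\|_F^2 \\ \text{s.t.} \quad & \boldsymbol{\Lambda}_{\Phi}\boldsymbol{\Pi}^T\boldsymbol{\Lambda}_{\Psi}\boldsymbol{\Lambda}_{\Psi}^T\boldsymbol{\Pi}\boldsymbol{\Lambda}_{\Phi}^T = \mathbf{I}_m, \\ & \boldsymbol{\Pi} \text{ is a permutation matrix.} \end{aligned} \tag{28}$$

The solution to this optimization problem is trivially given by:

$$\boldsymbol{\Pi} = \mathbf{J}_n, \tag{29}$$

and

$$\boldsymbol{\Lambda}_{\Phi} = \left[\mathrm{Diag}\left(\lambda_1^{\Phi}, \lambda_2^{\Phi}, \ldots, \lambda_m^{\Phi}\right) \ \mathbf{O}_{m \times (n-m)}\right] = \left[\mathrm{Diag}\left(\frac{1}{\lambda_m^{\Psi}}, \frac{1}{\lambda_{m-1}^{\Psi}}, \ldots, \frac{1}{\lambda_1^{\Psi}}\right) \ \mathbf{O}_{m \times (n-m)}\right]. \tag{30}$$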

Proposition 3 uncovers the key operations performed by this sensing matrix design. In particular, this
sensing matrix design i) exposes the modes (singular values) of the dictionary; ii) passes through the m 
strongest modes and filters out the n — m weakest modes; and iii) weighs the strongest modes. This is 
accomplished by taking the matrix of right singular vectors of the sensing matrix to correspond to the 
matrix of left singular vectors of the dictionary and taking the strongest modes of the dictionary. 

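Proposition 3 leads immediately to the sensing matrix design, which is consistent with the sensing cost
constraint $\|\boldsymbol{\Phi}\|_F^2 = n$, as follows:

$$\boldsymbol{\Phi} = \frac{\sqrt{n}}{\|\boldsymbol{\Lambda}_{\Phi}\|_F}\,\mathbf{U}_{\Phi}\boldsymbol{\Lambda}_{\Phi}\mathbf{J}_n\mathbf{U}_{\Psi}^T. \tag{31}$$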
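
The operations described for design approach 2 (expose the modes of the dictionary, keep the m strongest, and weigh them by reciprocal singular values) can be sketched in a few lines of NumPy. This is an illustration that follows the prose description rather than the exact matrix bookkeeping of the proposition: the function names are hypothetical, the arbitrary orthonormal factor is taken to be the identity, the ordering of the rows is left as produced by the SVD, and the m largest singular values of the dictionary are assumed to be strictly positive.

```python
import numpy as np

def design2_unnormalized(Psi, m):
    """Phi with Phi Psi Psi^T Phi^T = I_m built from the m strongest modes of Psi."""
    U_Psi, lam, _ = np.linalg.svd(Psi, full_matrices=False)  # lam sorted descending
    # Row k of Phi is the k-th left singular vector scaled by 1 / lambda_k.
    return (U_Psi[:, :m] / lam[:m]).T

def design2(Psi, m):
    """Rescale so that the sensing cost satisfies ||Phi||_F^2 = n."""
    Phi = design2_unnormalized(Psi, m)
    return np.sqrt(Psi.shape[0]) * Phi / np.linalg.norm(Phi)  # Frobenius norm
```

Before rescaling, the squared Frobenius norm equals the sum of the squared reciprocals of the m largest singular values, which is the smallest value attainable under the Parseval constraint; after rescaling the equivalent sensing matrix is a tight frame rather than a Parseval one.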
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Note that the design approach 1 balances the requirements of guaranteeing that the equivalent sensing 
matrix is as close as possible to a Parseval tight frame against the requirements of guaranteeing that the 
cost is as small as possible; in the design approach 2, we force the equivalent sensing matrix to be a 
Parseval tight frame and minimize the sensing energy. Note also that the proposed designs are closed- 
form whereas other designs in the literature, such as Elad's method |[T6l . Duarte-Carvajalino and Sapiro's 
method ifTTl . and Xu et al.'s method ifTSl . are iterative. 



Finally, it is also interesting to note that design 1 and design 2 reduce to the design in [24], i.e., to a
tight frame, when we take the dictionary to be orthonormal rather than overcomplete. 

V. Performance Results 

We now compare the performance of the proposed sensing matrix designs to other designs in the CS 
setting. 

A. Distribution of the off-diagonal entries of the coherence matrix 

We first investigate the histogram of the absolute values of the off-diagonal entries of the coherence
matrix. In this investigation, we use a random dictionary $\Psi \in \mathbb{R}^{64 \times 80}$ with entries drawn
from i.i.d. zero mean and unit variance Gaussian distributions and then normalized to $\|\Psi\|_F^2 = 80$. We
also generate three sensing matrices $\Phi \in \mathbb{R}^{40 \times 64}$ using the proposed approach 1 with $\alpha = 1$ and $\alpha = 0.1$,
and using the proposed approach 2. We compare the performance of the proposed designs with a random
Gaussian matrix design and with three iterative designs, namely, Elad's design [16], Xu's design [18]
and Sapiro's design [17].
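The quantity being histogrammed can be sketched as follows. This is an illustrative snippet rather than the code used for the figure: it assumes the coherence matrix is the Gram matrix of the equivalent sensing matrix after its columns are normalized to unit norm, and the helper name `offdiag_coherence` is hypothetical. Only the random Gaussian baseline is generated here.

```python
import numpy as np

def offdiag_coherence(Phi, Psi):
    """Absolute off-diagonal entries of the Gram matrix of Phi @ Psi,
    after normalizing its columns to unit norm (an assumed convention)."""
    A = Phi @ Psi
    A = A / np.linalg.norm(A, axis=0, keepdims=True)
    G = A.T @ A
    off_diag = ~np.eye(G.shape[0], dtype=bool)
    return np.abs(G[off_diag])

rng = np.random.default_rng(0)
Psi = rng.standard_normal((64, 80))
Psi *= np.sqrt(80) / np.linalg.norm(Psi, "fro")   # ||Psi||_F^2 = 80
Phi = rng.standard_normal((40, 64))               # random Gaussian baseline
entries = offdiag_coherence(Phi, Psi)
counts, edges = np.histogram(entries, bins=20, range=(0.0, 1.0))
```

Passing a designed sensing matrix in place of `Phi` gives the corresponding histogram for that design.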

It has been observed that coherence matrices with small off-diagonal entries result in good reconstruction
performance in accordance with the mutual coherence reconstruction condition [16]-[18]. Fig. 1
shows that the distributions of the off-diagonal entries in both designs based on approach 1 are better
than that for the Gaussian matrix design. In particular, note that the design with $\alpha = 0.1$ has off-diagonal
entries with smaller absolute value than does the design with $\alpha = 1$. However, $\|\Phi\Psi\|_F^2 = 41.2639$ for the
$\alpha = 0.1$ design is lower than $\|\Phi\Psi\|_F^2 = 89.1929$ for the $\alpha = 1$ design - owing to the lower penalty used
in the optimization problem in (21) - and also lower than $\|\Phi\Psi\|_F^2 = 84.5554$ for the Gaussian design.
This observation - via the analysis in Section III - ought to lead to poorer MSE performance of the
design with $\alpha = 0.1$ in relation to the design with $\alpha = 1$ and also in relation to the Gaussian design. The
distribution of the off-diagonal entries in the design based on approach 2 is also better than that of the Gaussian
matrix. In addition, the sensing energy of the equivalent sensing matrix, $\|\Phi\Psi\|_F^2 = 89.1929$, is not reduced
compared to the Gaussian matrix design. Elad's and Xu's designs exhibit good mutual coherence but
poor sensing energy. The attributes of Sapiro's design are equivalent to those of the design based on
approach 2. Yet, our design is non-iterative whereas Sapiro's design follows an iterative procedure.

Fig. 1. Histogram of the absolute value of the off-diagonal entries of the coherence matrix.

The reconstruction performance of the proposed designs is further investigated in the following subsec- 
tions, both in terms of the MSE of the ideal oracle estimator as well as the MSE of practical estimators. 

B. The MSE performance using the oracle estimator 

In this investigation, we evaluate the MSE performance of various designs using the ideal oracle
estimator, which has played a key role in the definition of our designs. The MSE is evaluated by averaging
over 1000 trials, where in each trial we generate randomly a sparse vector with $s$ randomly placed ±1
spikes.* The random dictionary $\Psi \in \mathbb{R}^{64 \times 80}$ is generated randomly by drawing its elements from i.i.d.
zero mean and unit variance Gaussian distributions and then normalized to $\|\Psi\|_F^2 = 80$. The parameter
$\alpha$ is set to be equal to 1 for the design based on approach 1.
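A Monte Carlo evaluation of this kind can be sketched as below. The snippet is illustrative only: it assumes measurements of the form $y = Ax + w$ with white Gaussian noise of variance `sigma2` added after sensing, where $A$ stands for the equivalent sensing matrix, and it implements the oracle estimator as least squares restricted to the known support. The function names are hypothetical.

```python
import numpy as np

def oracle_mse(A, s, sigma2, trials, rng):
    """Monte Carlo MSE of the oracle (known-support least-squares) estimator."""
    m, n_tilde = A.shape
    total = 0.0
    for _ in range(trials):
        J = rng.choice(n_tilde, size=s, replace=False)   # random support
        x_J = rng.choice([-1.0, 1.0], size=s)            # +/-1 spikes
        noise = np.sqrt(sigma2) * rng.standard_normal(m)
        y = A[:, J] @ x_J + noise
        # Least squares on the columns indexed by the true support.
        x_hat, *_ = np.linalg.lstsq(A[:, J], y, rcond=None)
        total += np.sum((x_hat - x_J) ** 2)
    return total / trials

def oracle_mse_closed_form(A, J, sigma2):
    """sigma^2 * Tr((A_J^T A_J)^{-1}) for one support J (white-noise assumption)."""
    A_J = A[:, J]
    return sigma2 * np.trace(np.linalg.inv(A_J.T @ A_J))

rng = np.random.default_rng(0)
A_example = rng.standard_normal((40, 80)) / np.sqrt(40)  # stand-in equivalent matrix
mse_estimate = oracle_mse(A_example, s=5, sigma2=1e-4, trials=200, rng=rng)
```

Averaging the closed-form expression over random supports should agree with the Monte Carlo estimate up to sampling error, which gives a quick sanity check on a simulation.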

Fig. 2 illustrates that the performance of our designs compares very well with that of the best iterative
designs. A particularly relevant aspect relates to the sensing matrix normalization of iterative designs. 



*We have also performed this experiment and the following experiment with sparse vectors where the randomly placed
non-zero elements follow a zero-mean unit-variance Gaussian distribution. Such experiments, which are not reported in view of
space limitations, also demonstrate that our designs outperform other designs in the literature.

Fig. 2. MSE performance of different sensing matrices for the oracle estimator ($m = 40$, $n = 64$, $\tilde{n} = 80$ and $\sigma^2 = 10^{-4}$).
(a) Elad's, Xu's and Sapiro's designs are not normalized; (b) Elad's, Xu's and Sapiro's designs are normalized.



Sapiro's design works very well with normalization but Elad's and Xu's designs do not. In fact, the MSE
performance of Elad's and Xu's designs is worse than that of the random Gaussian design, due to the
lower sensing energy (see Fig. 1). The proposed approach 2 has a better MSE performance than approach
1, as the parameter $\alpha$ of approach 1, which is set empirically, affects the performance.

C. The MSE performance using practical estimators 

In this investigation, we evaluate the MSE performance of various sensing matrix designs using practical
estimators, which include the BPDN, the Dantzig selector and the OMP. As in the previous investigation,
the MSE is evaluated by averaging over 1000 trials, where in each trial we generate randomly a sparse
vector with $s$ randomly placed ±1 spikes. The random dictionary $\Psi \in \mathbb{R}^{64 \times 80}$ is generated randomly
by drawing its elements from i.i.d. zero mean and unit variance Gaussian distributions and then normalized
to $\|\Psi\|_F^2 = 80$. The parameter $\alpha$ is also set to be equal to 1 for the design based on approach 1.




Fig. 3. MSE performance of different sensing matrices for (a) the BPDN, (b) the Dantzig selector, and (c) the OMP ($m = 40$,
$n = 64$, $\tilde{n} = 80$ and $\sigma^2 = 10^{-4}$).

We first evaluate the MSE performance of various sensing matrix designs for various sparsity levels 
and for a fixed number of measurements, $m = 40$. Fig. 3 shows that the proposed design approach 1
outperforms the Gaussian matrix design for all three estimators. In turn, the proposed design approach
2 outperforms all the other designs. In fact, this design is very attractive, due to the low computational
cost associated with the generation of the sensing matrix.




Fig. 4. MSE performance of different sensing matrices for (a) the BPDN, (b) the Dantzig selector, and (c) the OMP ($s = 10$,
$n = 64$, $\tilde{n} = 80$ and $\sigma^2 = 10^{-4}$).

We now evaluate the MSE performance of various sensing matrix designs for various numbers of
measurements and for a fixed sparsity level $s = 10$. Fig. 4 shows once again that the proposed designs
outperform the Gaussian matrix design, improving the reconstruction performance for all three
estimators. The iterative designs of Elad, Sapiro and Xu slightly outperform the proposed designs in
some cases, but the computational complexity associated with the generation of these designs is much
higher than that associated with the generation of our designs.
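For readers who wish to reproduce experiments of this type, a bare-bones OMP can be sketched as follows. This is a generic textbook-style implementation under the assumption that the sparsity level $s$ is known and used as the stopping rule; it is not the implementation used to produce the figures, and the function name `omp` is chosen here for illustration.

```python
import numpy as np

def omp(A, y, s):
    """Orthogonal matching pursuit: greedy s-sparse estimate of x from y = A x + noise."""
    k = A.shape[1]
    col_norms = np.linalg.norm(A, axis=0)
    support = []
    residual = np.asarray(y, dtype=float).copy()
    coef = np.zeros(0)
    for _ in range(s):
        # Correlate the residual with every (normalized) column.
        corr = np.abs(A.T @ residual) / col_norms
        if support:
            corr[support] = -np.inf          # never pick the same column twice
        support.append(int(np.argmax(corr)))
        # Re-fit all selected coefficients by least squares.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x_hat = np.zeros(k)
    x_hat[support] = coef
    return x_hat
```

In the setting of this section, `A` would be the equivalent sensing matrix and `y` the noisy measurement vector; the squared error between `x_hat` and the true sparse vector is then averaged over trials.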

D. The reconstruction performance for learned dictionaries in CS imaging 

We now assess the performance of the proposed designs by considering other practical issues. In 
particular, we consider real rather than synthetic signals whose representations are typically nearly sparse 
instead of sparse in some dictionary. We also consider learned dictionaries rather than random ones.*

In the experiment, we use the cameraman image of size 256 × 256 pixels, which is partitioned into
1024 nonoverlapping patches of size 8 × 8 pixels, i.e., $n = 64$. We train a dictionary of size 64 × 81
for sparsely representing these nonoverlapping patches by using the K-SVD method. The number
of measurements for each patch is set to be equal to 40 and the measurements are corrupted by additive
zero-mean Gaussian noise with variance $\sigma^2 = 10^{-4}$. We set $\alpha = 1$ for the proposed approach 1. We
also use the OMP to reconstruct the image from its noisy measurements owing to its fast execution. We
evaluate performance using the reconstructed signal to noise ratio (RSNR):

$$\mathrm{RSNR} = \frac{\|f\|_2}{\|f - \hat{f}\|_2}, \tag{32}$$

where $f$ represents the original image and $\hat{f}$ represents the reconstructed image.

Fig. 5 demonstrates the higher reconstruction quality and RSNR of our sensing matrix designs in
relation to the random Gaussian matrix design. The proposed approach 2 exhibits the best performance.
Sapiro's iterative design also exhibits a very good performance but Elad's and Xu's iterative designs with
normalized sensing energy exhibit very poor performance, which in fact is worse than that for the Gaussian
matrix design. Interestingly, we recall that the performance of the two proposed designs compares well to
that of the Gaussian matrix design for random bases and exactly sparse signals, as shown in Figs. 3 and 4.

*We note that the dictionary learning process yields sparse representations that do not necessarily fit the statistical signal
model that has been used as a basis of the sensing matrix design procedure. However, the value of the sensing matrix designs
is also justified by the fact that they also yield observable gains in this scenario.

Fig. 5. Reconstructed images using a learned basis. (a) Gaussian matrix design, RSNR = 8.5197; (b) proposed approach 1,
RSNR = 9.5386; (c) proposed approach 2, RSNR = 15.4729; (d) Elad's design with normalization, RSNR = 7.3595; (e) Xu's
design with normalization, RSNR = 7.0452; (f) Sapiro's design with normalization, RSNR = 15.4537.

VI. Discussion: Random vs. Optimized Projections 

Recent results [37]-[39] have established that - at least asymptotically with the signal ambient
dimension - no sensing or reconstruction strategy leads to essentially better performance than random sensing
and standard $\ell_1$-based reconstruction. In contrast, our results indicate that a tight-frame based sensing
matrix design can clearly outperform a random sensing matrix design for low signal ambient dimensions.

It is thus interesting to ask whether our optimized designs can also outperform the random ones with 
an increase of the signal ambient dimension. This question is also justified by the fact that the recent 
contributions in the literature concentrate on signals that are sparse in the canonical basis rather than 
signals that are sparse in an overcomplete dictionary. Interestingly, the numerical analysis reveals that 
the trends applicable to overcomplete dictionaries can be distinct from those applicable to the canonical 
dictionary (and also orthonormal ones). 

Fig. 6. Histogram of the singular values of a random overcomplete dictionary ($n = 1000$, $\tilde{n} = 1200$ and $\|\Psi\|_F^2 = n$) and a
specialized overcomplete dictionary ($n = 1000$, $\tilde{n} = 1200$, $\|\Psi\|_F^2 = n$ and $\lambda_{i+1}^\Psi / \lambda_i^\Psi = 0.995$).

The experiments also consider randomly generated sparse vectors with $s$ randomly placed ±1 spikes.
We consider both a random Gaussian sensing matrix design and an optimized sensing matrix design based
on approach 2 due to its low computational cost. The sensing matrix designs are normalized such that
$\|\Phi\|_F^2 = n$. We also consider three distinct dictionaries: i) the canonical basis; ii) a random overcomplete
dictionary; and iii) a specified overcomplete dictionary. The random overcomplete dictionary is generated
by drawing its elements randomly in accordance with i.i.d. zero-mean unit-variance Gaussian distributions.
The specified overcomplete dictionary is generated via its singular value decomposition by taking two
randomly generated orthonormal matrices and by taking its positive singular values $\lambda_1^\Psi > \ldots > \lambda_n^\Psi$ such
that $\lambda_1^\Psi = 1$ and $\lambda_{i+1}^\Psi / \lambda_i^\Psi = 0.995$ ($i = 1, \ldots, n - 1$). Both overcomplete dictionaries are also normalized
such that $\|\Psi\|_F^2 = n$.
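The construction of the specified dictionary can be sketched in NumPy as follows. The snippet is illustrative: it assumes the random orthonormal factors are obtained by QR factorization of Gaussian matrices (the text does not state how they are drawn), and the function name `specified_dictionary` and the small example sizes are hypothetical.

```python
import numpy as np

def specified_dictionary(n, n_tilde, ratio, rng):
    """n x n_tilde dictionary with singular values lambda_1 = 1 and
    lambda_{i+1} / lambda_i = ratio, rescaled so that ||Psi||_F^2 = n."""
    # Random orthonormal factors (QR of Gaussian matrices is an assumed choice).
    U, _ = np.linalg.qr(rng.standard_normal((n, n)))
    V, _ = np.linalg.qr(rng.standard_normal((n_tilde, n_tilde)))
    sv = ratio ** np.arange(n)              # 1, ratio, ratio^2, ...
    Psi = (U * sv) @ V[:, :n].T             # U @ diag(sv) @ V[:, :n]^T
    Psi *= np.sqrt(n) / np.linalg.norm(Psi, "fro")
    return Psi

def random_dictionary(n, n_tilde, rng):
    """i.i.d. Gaussian dictionary rescaled so that ||Psi||_F^2 = n."""
    Psi = rng.standard_normal((n, n_tilde))
    return Psi * np.sqrt(n) / np.linalg.norm(Psi, "fro")

rng = np.random.default_rng(0)
Psi_spec = specified_dictionary(20, 24, 0.995, rng)
Psi_rand = random_dictionary(20, 24, rng)
```

The rescaling changes every singular value by the same factor, so the ratio between consecutive singular values is preserved while the largest one is no longer equal to one.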

The rationale for considering two different overcomplete dictionaries is that it is not entirely clear
how to change the dictionary as the signal ambient dimension is varied.* Therefore, two overcomplete
dictionaries that exhibit very different singular value profiles, as shown in Fig. 6, are chosen to
allow us to articulate different trends in the experiments.

The MSE performance associated with the various sensing matrix designs is also averaged over 1000
trials. We unveil the performance trends by showing how the ratio of the average MSE associated with
the optimized sensing matrix design to the average MSE associated with a random sensing matrix design
behaves as a function of the signal ambient dimension for various combinations of $(m, s)$, both for the
Dantzig selector and the oracle estimator. The signal dimension is restricted to $n \le 1000$ due to the long
execution time of the simulations.



*Note that this issue is not relevant when the signal dimension is fixed as in the previous experiments (or for the canonical
basis).

Fig. 7. Ratio of the average MSE associated with an optimized sensing matrix design to that associated with a random Gaussian
sensing matrix design for signals that are sparse in the canonical basis ($\sigma^2 = 10^{-4}$).

Fig. 8. Ratio of the average MSE associated with an optimized sensing matrix design to that associated with a random Gaussian
sensing matrix design for signals that are sparse in a randomly generated overcomplete dictionary ($\tilde{n} = 1.2n$ and $\sigma^2 = 10^{-4}$).

A. Case I: Signals that are sparse on the canonical basis 

Fig. 7 examines how the ratio of the average MSE associated with the optimized sensing matrix design
- which is a tight frame - to the average MSE associated with a random Gaussian sensing matrix design
behaves as a function of the signal dimension. One observes that the average MSE ratio tends to one
with the increase of the signal dimension both for the oracle estimator and the Dantzig selector. This is
due to the fact that a random Gaussian matrix tends to a tight frame with the increase of $n$ for a fixed $m$.

It turns out that this result is consistent with the result in [37], where random sensing matrix designs
are demonstrated to be near-optimal (asymptotically) for signals that are sparse in the canonical basis.
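The claim that a random Gaussian matrix approaches a tight frame as $n$ grows for a fixed $m$ can be checked numerically. The sketch below uses the ratio of the largest to the smallest singular value as a proxy (an $m \times n$ matrix with $m \le n$ is a tight frame exactly when its $m$ singular values coincide); the function name and the chosen sizes are assumptions made for illustration.

```python
import numpy as np

def singular_value_spread(m, n, rng):
    """Ratio of largest to smallest singular value of an m x n Gaussian matrix.
    A ratio of exactly 1 would mean the matrix is a tight frame."""
    sv = np.linalg.svd(rng.standard_normal((m, n)), compute_uv=False)
    return sv[0] / sv[-1]

rng = np.random.default_rng(0)
spreads = {n: singular_value_spread(20, n, rng) for n in (100, 1000, 4000)}
for n, spread in spreads.items():
    print(n, spread)
```

The spread shrinks toward one as $n$ increases, in line with the explanation given above for the behavior observed in Fig. 7.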
Fig. 9. Ratio of the average MSE associated with an optimized sensing matrix design to that associated with a random Gaussian
sensing matrix design for signals that are sparse in a specified overcomplete dictionary ($\tilde{n} = 1.2n$ and $\sigma^2 = 10^{-4}$).

B. Case II: Signals that are sparse on an overcomplete dictionary 

Figs. 8 and 9 examine how the average MSE ratio behaves as a function of the signal dimension for
the random and specified overcomplete dictionaries, respectively. One now observes that - and in sharp
contrast to the canonical basis scenario - the average MSE ratio tends to increase with the increase of
the signal dimension. This trend is exhibited by the oracle estimator for the pairs ($m = 100$, $s = 5$) and
($m = 80$, $s = 10$). The trend is also exhibited by the Dantzig selector for ($m = 100$, $s = 5$) but not for
($m = 80$, $s = 10$): this exception seems to be due to severe reconstruction errors in view of the fact that
one may not be satisfying the requirement $m = O(s \log(n/s))$ [1], [2].

It is relevant though to point out a major difference in the behavior of the trends for the random and
specified overcomplete dictionaries. For the random dictionary, the average MSE ratio appears to saturate
with the increase of the signal dimension: this fact can be justified by noting that not only does the
optimized design tend to a tight frame with the increase of $n$ for a fixed $m$ - because the $m$ largest
singular values of a random dictionary tend to be similar with the increase of $n$ for a fixed $m$ (see also
Fig. 6) - but the random Gaussian matrix design also tends to a tight frame with the increase of
$n$ for a fixed $m$, as discussed previously. In contrast, for the specified dictionary the average MSE ratio
does not appear to saturate with the increase of the signal dimension.

Fig. 10. Average sensed energy $\|\Phi\Psi\|_F^2$ for a random overcomplete dictionary ($\tilde{n} = 1.2n$).

Fig. 11. Average sensed energy $\|\Phi\Psi\|_F^2$ for a specialized overcomplete dictionary ($\tilde{n} = 1.2n$, $\|\Psi\|_F^2 = n$ and $\lambda_{i+1}^\Psi / \lambda_i^\Psi = 0.995$).

It turns out that such trends can also be partly reconciled with the arguments of the previous sections.
In particular, Figs. 10 and 11 depict how the average sensed energy (i.e., the energy present at the
input to the estimator) behaves as a function of the signal dimension for the random and the specified
overcomplete dictionaries, respectively. Note that the average sensed energy $E_x\left(\mathrm{Tr}\left(\Phi\Psi x x^T \Psi^T \Phi^T\right)\right)$
corresponds to the equivalent sensing matrix energy $\|\Phi\Psi\|_F^2$ in view of the fact that $E_x(x x^T) = I_{\tilde{n}}$. We
would like to emphasize that for both Figs. 10 and 11 the sensing matrix designs have been normalized
such that $\|\Phi\|_F^2 = n$.
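The correspondence between the average sensed energy and the energy of the equivalent sensing matrix follows from the linearity of the trace and of the expectation. A short derivation, assuming as the text does that the second-moment matrix of the sparse vector is the identity, is:

```latex
\begin{aligned}
E_x\!\left(\mathrm{Tr}\!\left(\Phi\Psi x x^T \Psi^T \Phi^T\right)\right)
  &= \mathrm{Tr}\!\left(\Phi\Psi\, E_x\!\left(x x^T\right) \Psi^T \Phi^T\right) \\
  &= \mathrm{Tr}\!\left(\Phi\Psi \Psi^T \Phi^T\right)
   = \|\Phi\Psi\|_F^2 .
\end{aligned}
```

If the second-moment matrix were instead a scaled identity, the sensed energy would scale by the same constant for every design, so comparisons between designs would be unaffected.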

One observes clearly that the optimized designs have the capability to "sense" higher energy than the
random ones in the presence of overcomplete dictionaries (both the random and the specified overcomplete
dictionary) and - via the analysis in Section III - potentially have the capability to offer a lower MSE
(as confirmed in Figs. 8 and 9). Figs. 10 and 11 also confirm that for the random dictionary the sensed
energy tends to saturate with the increase of the signal dimension but for the specified dictionary it does
not.

We recognize that this analysis is mainly heuristic: a proper understanding of the advantages of designed
projections over random ones in the presence of signals that admit sparse representations in overcomplete
dictionaries is beyond the scope of this article. However, the practical relevance of the overall results -
independently of whether or not it can be crisply shown that optimized projections clearly outperform
random ones for high ambient dimensions - is also associated with the fact that in some applications it is
typical to deal with small dimensions. For example, in certain imaging applications it is standard practice
to divide an image into various (possibly overlapping) patches of typically small dimensions [46]. The
results then show that there is indeed significant value in using optimized projections in lieu of random
ones.



VII. Conclusions

In this paper, we have considered the design of sensing matrices for CS applications. By showing that
one ought to set the equivalent sensing matrix to be equal to a tight frame in order to derive a good MSE
performance subject to sensing energy constraints, we have proposed two sensing matrix designs that are
instilled with operational significance. Our designs also exhibit various advantages in relation to other
designs in the literature. In particular, the proposed designs exhibit MSE performance gains in relation
to the conventional random sensing matrix designs as well as other optimized designs. The proposed
designs are also closed-form, and as a result easy to generate, whereas other optimized designs in the
literature are typically iterative.

Appendix A
Proof of Proposition 1

This proof follows the ideas of the proof of Proposition 1 in [24]. Let $\tilde{s} \le s$ be a positive integer. Let
also $\mathcal{J}_t \subset \{1, 2, \ldots, \tilde{n}\}$ ($t = 1, \ldots, T_{\tilde{s}}$) denote a support set with cardinality $\tilde{s}$, where $T_{\tilde{s}} = \binom{\tilde{n}}{\tilde{s}} = \frac{\tilde{n}!}{\tilde{s}!\,(\tilde{n} - \tilde{s})!}$.
We let $D_{\mathcal{J}_t} = E_{\mathcal{J}_t}^T Q E_{\mathcal{J}_t}$. We also let $\lambda_1^{D_{\mathcal{J}_t}} \ge \ldots \ge \lambda_{\tilde{s}}^{D_{\mathcal{J}_t}}$ be the eigenvalues of $D_{\mathcal{J}_t}$. Let $\Pr(\|\mathcal{J}\|_0 = \tilde{s})$
denote the probability that the support size of $\mathcal{J}$ is $\tilde{s}$.

We now note that
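By the arithmetic mean - harmonic mean inequality, it follows that:

where one achieves the lower bound with $\lambda_n^{D_{\mathcal{J}_t}} = \frac{m}{\tilde{n}}$ ($n = 1, \ldots, \tilde{s}$; $t = 1, \ldots, T_{\tilde{s}}$). This implies
immediately that the matrix $Q = \frac{m}{\tilde{n}} I_{\tilde{n}}$, which is consistent with the constraints, minimizes:

$$E_{\mathcal{J}_t}\left(\mathrm{Tr}\left(D_{\mathcal{J}_t}^{-1}\right)\right) = \frac{1}{T_{\tilde{s}}} \sum_{t=1}^{T_{\tilde{s}}} \sum_{n=1}^{\tilde{s}} \frac{1}{\lambda_n^{D_{\mathcal{J}_t}}}, \tag{35}$$

and hence also minimizes:

$$E_{\mathcal{J}}\left(\mathrm{Tr}\left(D_{\mathcal{J}}^{-1}\right)\right) = \sum_{\tilde{s}=1}^{s} \Pr(\|\mathcal{J}\|_0 = \tilde{s})\, E_{\mathcal{J}_t}\left(\mathrm{Tr}\left(D_{\mathcal{J}_t}^{-1}\right)\right). \tag{36}$$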

Appendix B 
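Proof of Proposition 2

This proof follows the ideas of the proof of Proposition 2 in [24]. By using the SVD $A = U_A \Lambda_A V_A^T$,
where $U_A \in \mathbb{R}^{m \times m}$ and $V_A \in \mathbb{R}^{\tilde{n} \times \tilde{n}}$ are orthonormal matrices, and $\Lambda_A \in \mathbb{R}^{m \times \tilde{n}}$ is a matrix whose
main diagonal entries ($\lambda_1^A \ge \lambda_2^A \ge \ldots \ge \lambda_m^A \ge 0$) are the singular values of $A$, and the off-diagonal entries
are zeros, we pose the convex optimization problem:

$$\min_{\lambda_1^A, \ldots, \lambda_m^A} \ \sum_{i=1}^{m} \left( (\lambda_i^A)^2 - \frac{m}{\tilde{n}} \right)^2 \quad \text{s.t.} \quad \sum_{i=1}^{m} (\lambda_i^A)^2 = m, \quad \lambda_i^A \ge 0 \ (i = 1, 2, \ldots, m), \tag{37}$$

which, in view of the fact that $\left\|A^T A - \frac{m}{\tilde{n}} I_{\tilde{n}}\right\|_F^2 = \sum_{i=1}^{m} \left( (\lambda_i^A)^2 - \frac{m}{\tilde{n}} \right)^2 + (\tilde{n} - m)\left(\frac{m}{\tilde{n}}\right)^2$, where the last term
does not depend on the singular values, and $\mathrm{Tr}\left(A^T A\right) = \sum_{i=1}^{m} (\lambda_i^A)^2$,
leads to the solution of (15). Since the solution of (37) is $\lambda_i^A = 1$ ($i = 1, 2, \ldots, m$), it follows that any
Parseval tight frame is the solution of (15).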
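To see why all singular values equal to one solve (37), it can help to write the argument out. The sketch below introduces two symbols purely for illustration: $t_i$ for the squared singular values and $c$ for the constant subtracted inside the square. It relies only on the trace constraint $\sum_i t_i = m$ and on the Cauchy-Schwarz inequality:

```latex
\begin{aligned}
\sum_{i=1}^{m} (t_i - c)^2
  &= \sum_{i=1}^{m} t_i^2 - 2c \sum_{i=1}^{m} t_i + m c^2
   = \sum_{i=1}^{m} t_i^2 - 2cm + m c^2 \\
  &\ge \frac{1}{m} \Big( \sum_{i=1}^{m} t_i \Big)^2 - 2cm + m c^2
   = m (1 - c)^2 ,
\end{aligned}
```

with equality if and only if all $t_i$ are equal, i.e., $t_i = 1$ for every $i$. The minimizer therefore does not depend on the value of $c$.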

Appendix C 
Tight frames and StRIP 

Another benefit of tight frames - more precisely, unit-norm tight frames - is their relation to the weaker
version of the RIP, namely the StRIP. The StRIP, which has been proposed by Calderbank et al. [22], can
be used to evaluate the expected-case performance of CS, whereas the RIP is a worst-case performance
indicator, as is the mutual coherence. The StRIP guarantees successful reconstruction of all but an
exponentially small fraction of $s$-sparse signals. The definition of StRIP uses a probability criterion
to replace the hard requirement demanded in the definition of the RIP.

Definition 2: A matrix $A \in \mathbb{R}^{m \times \tilde{n}}$ is said to be an $(s, \delta, \eta)$-StRIP matrix if for $s$-sparse vectors $x \in \mathbb{R}^{\tilde{n}}$
the inequalities

$$(1 - \delta)\|x\|_2^2 \le \|Ax\|_2^2 \le (1 + \delta)\|x\|_2^2 \tag{38}$$

hold with probability exceeding $1 - \eta$ with respect to a uniform distribution of the vectors $x$ among all
$s$-sparse vectors in $\mathbb{R}^{\tilde{n}}$ having the same fixed magnitudes.

Calderbank et al. [22] demonstrate that deterministic sensing matrices are StRIP matrices if they satisfy
all of the following criteria:

1) The rows of $A$ are orthogonal and all the row sums are zero, i.e., $\sum_{j=1}^{\tilde{n}} a_{ij} = 0$;

2) The columns of $A$ form a group under point-wise multiplication;

3) There is one column of $A$ equal to $\mathbf{1}$, which can be assumed as the first column. For all $i \in
\{2, \ldots, \tilde{n}\}$, $\left|\sum_{j=1}^{m} a_{ji}\right|^2 \le m^{2-\beta}$, where $0 < \beta < 1$.

A large class of matrices, including discrete chirp sensing matrices, Bose, Chaudhuri, and Hocquenghem 
(BCH) sensing matrices, Kerdock, Delsarte-Goethals and second order Reed Muller sensing matrices, 
satisfy these criteria, and thus are StRIP matrices. In [22], they prove that the RIP of these matrices is
satisfied with a probability exceeding $1 - O\left(\exp\left(-\frac{m^\beta}{s}\right)\right)$. However, a unit norm tight frame does not
necessarily satisfy these criteria. For example, an orthonormal matrix, which is also a unit norm tight
frame with $m = \tilde{n}$, does not necessarily satisfy $\sum_{j=1}^{\tilde{n}} a_{ij} = 0$.

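The following Proposition demonstrates that a unit-norm tight frame is also a StRIP matrix.

Proposition 4: Let $A \in \mathbb{R}^{m \times \tilde{n}}$ be a unit norm tight frame with mutual coherence equal to $\mu$. For any
$s$-sparse vectors $x \in \mathbb{R}^{\tilde{n}}$, the RIP holds with probability:

$$\Pr\left( \left| \|Ax\|_2^2 - \|x\|_2^2 \right| \le \delta \|x\|_2^2 \right) \ge 1 - (s/2)^{-\frac{\left(0.3894\,\delta - \frac{s}{m}\right)^2}{36 \mu^2 s \log_e(1 + s/2)}}, \tag{39}$$

where $\sqrt{237.42\, \mu^2 s \log_e(1 + s/2)} + \frac{2 e^{0.25} s}{m} \le \delta < 1$.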

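Proof: Let $A_{\mathcal{J}_s} \in \mathbb{R}^{m \times s}$ be an $s$-column submatrix of $A \in \mathbb{R}^{m \times \tilde{n}}$ ($s \le m \le \tilde{n}$), where $\mathcal{J}_s \subset
\{1, \ldots, \tilde{n}\}$ denotes a support set with cardinality $s$. Let $\lambda_1^{A_{\mathcal{J}_s}^T A_{\mathcal{J}_s}} \ge \cdots \ge \lambda_s^{A_{\mathcal{J}_s}^T A_{\mathcal{J}_s}} \ge 0$ be the
eigenvalues of the positive semi-definite matrix $A_{\mathcal{J}_s}^T A_{\mathcal{J}_s}$. We have that the maximum eigenvalue $\lambda_1^{A_{\mathcal{J}_s}^T A_{\mathcal{J}_s}}$
and minimum eigenvalue $\lambda_s^{A_{\mathcal{J}_s}^T A_{\mathcal{J}_s}}$ of $A_{\mathcal{J}_s}^T A_{\mathcal{J}_s}$ are given by

$$\lambda_1^{A_{\mathcal{J}_s}^T A_{\mathcal{J}_s}} = \max_{z \ne 0} \frac{\|A_{\mathcal{J}_s} z\|_2^2}{\|z\|_2^2}, \tag{40}$$

and

$$\lambda_s^{A_{\mathcal{J}_s}^T A_{\mathcal{J}_s}} = \min_{z \ne 0} \frac{\|A_{\mathcal{J}_s} z\|_2^2}{\|z\|_2^2}, \tag{41}$$

where $z \in \mathbb{R}^s$. Therefore, we have

$$\lambda_s^{A_{\mathcal{J}_s}^T A_{\mathcal{J}_s}} \|z\|_2^2 \le \|A_{\mathcal{J}_s} z\|_2^2 \le \lambda_1^{A_{\mathcal{J}_s}^T A_{\mathcal{J}_s}} \|z\|_2^2, \tag{42}$$

or

$$\left(\lambda_s^{A_{\mathcal{J}_s}^T A_{\mathcal{J}_s}} - 1\right) \|z\|_2^2 \le \|A_{\mathcal{J}_s} z\|_2^2 - \|z\|_2^2 \le \left(\lambda_1^{A_{\mathcal{J}_s}^T A_{\mathcal{J}_s}} - 1\right) \|z\|_2^2, \tag{43}$$

for any $z \in \mathbb{R}^s$. By defining $\delta_{\mathcal{J}_s} = \max\left\{ 1 - \lambda_s^{A_{\mathcal{J}_s}^T A_{\mathcal{J}_s}},\ \lambda_1^{A_{\mathcal{J}_s}^T A_{\mathcal{J}_s}} - 1 \right\}$, it follows that

$$\left| \|A_{\mathcal{J}_s} z\|_2^2 - \|z\|_2^2 \right| \le \delta_{\mathcal{J}_s} \|z\|_2^2. \tag{44}$$

We can immediately derive that the RIC satisfies $\delta_s = \max_{\mathcal{J}_s} \delta_{\mathcal{J}_s}$ by comparing (44) with the definition of the RIP.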
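The per-support quantity defined in this step is easy to compute numerically, which gives an empirical view of the statistical (rather than worst-case) behavior discussed in this appendix. The sketch below is illustrative only; both function names are hypothetical, and supports are drawn uniformly at random as an assumption matching the random-submatrix setting.

```python
import numpy as np

def restricted_isometry_deviation(A, J):
    """delta_J = max(1 - lambda_min, lambda_max - 1) for the Gram matrix of A[:, J]."""
    A_J = A[:, J]
    eigvals = np.linalg.eigvalsh(A_J.T @ A_J)     # eigenvalues in ascending order
    return float(max(1.0 - eigvals[0], eigvals[-1] - 1.0))

def empirical_delta_quantile(A, s, trials, q, rng):
    """q-quantile of delta_J over uniformly random supports of size s."""
    n_cols = A.shape[1]
    deltas = [
        restricted_isometry_deviation(A, rng.choice(n_cols, size=s, replace=False))
        for _ in range(trials)
    ]
    return float(np.quantile(deltas, q))
```

The maximum of `restricted_isometry_deviation` over all supports of size $s$ would give the RIC, which is combinatorial to compute; the quantile over random supports is the statistical counterpart.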

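The following theorem, which has been proved by Tropp in [51], defines a probability bound for $\delta_{\mathcal{J}_s}$.

Theorem 2: Let $A \in \mathbb{R}^{m \times \tilde{n}}$ ($m \le \tilde{n}$) be a matrix whose columns have unit norm, i.e.,
$\|a_i\|_2 = 1$ for all $i \in \{1, \ldots, \tilde{n}\}$, $A_{\mathcal{J}_s} \in \mathbb{R}^{m \times s}$ ($s \le m$) be a random $s$-column submatrix of $A$ with a
support $\mathcal{J}_s \subset \{1, \ldots, \tilde{n}\}$ of cardinality $s$, and $\mu$ be the mutual coherence of $A$. Suppose that

$$\sqrt{144 \mu^2 s \log_e(1 + s/2)\, p} + \frac{2s}{\tilde{n}} \|A\|^2 \le e^{-0.25} \delta, \tag{45}$$

where $p \ge 1$ and $0 < \delta < 1$. Then

$$\Pr(\delta_{\mathcal{J}_s} \ge \delta) \le (s/2)^{-p}. \tag{46}$$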

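We now use the fact that $A$ is a unit norm tight frame with frame bound equal to $\frac{\tilde{n}}{m}$, so that $\|A\|^2 = \frac{\tilde{n}}{m}$.
We then rewrite (45) to be

$$1 \le p \le \frac{\left(e^{-0.25} \delta - \frac{2s}{m}\right)^2}{144 \mu^2 s \log_e(1 + s/2)}, \tag{47}$$

where $\sqrt{144 e^{0.5} \mu^2 s \log_e(1 + s/2)} + \frac{2 e^{0.25} s}{m} \le \delta < 1$. Since the inequality (46) holds for any $p$ satisfying
(45), we have that for a random set $\mathcal{J}_s$, the inequality (44) leads to the probability bound

$$\Pr\left( \left| \|A_{\mathcal{J}_s} z\|_2^2 - \|z\|_2^2 \right| \le \delta \|z\|_2^2 \right) \ge \Pr(\delta_{\mathcal{J}_s} \le \delta) \ge 1 - (s/2)^{-\frac{\left(0.3894\,\delta - \frac{s}{m}\right)^2}{36 \mu^2 s \log_e(1 + s/2)}},$$

when $\sqrt{237.42\, \mu^2 s \log_e(1 + s/2)} + \frac{2 e^{0.25} s}{m} \le \delta < 1$. ■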
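To get a feel for the numbers, the bound obtained at the end of this proof can be evaluated directly. The helper below is a sketch under the assumption that the bound has the form derived from (45)-(47) with $\|A\|^2$ equal to the frame bound, i.e., exponent $(e^{-0.25}\delta - 2s/m)^2 / (144 \mu^2 s \log_e(1 + s/2))$; the function name and the example parameter values are hypothetical.

```python
import math

def strip_probability_bound(mu, s, m, delta):
    """Lower bound 1 - (s/2)^(-p) on the probability that the isometry
    inequality holds; returns None if delta is outside the admissible range.
    The bound is only informative for s > 2, since (s/2)^(-p) < 1 requires s/2 > 1."""
    log_term = mu ** 2 * s * math.log(1.0 + s / 2.0)
    # Smallest admissible delta, obtained by requiring p >= 1.
    delta_min = math.sqrt(144.0 * math.exp(0.5) * log_term) + 2.0 * math.exp(0.25) * s / m
    if not (delta_min <= delta < 1.0):
        return None
    # Largest exponent p allowed by the rewritten condition.
    p = (math.exp(-0.25) * delta - 2.0 * s / m) ** 2 / (144.0 * log_term)
    return 1.0 - (s / 2.0) ** (-p)

bound = strip_probability_bound(mu=0.01, s=4, m=1000, delta=0.9)
```

Note that $144 e^{0.5} \approx 237.42$ and $e^{-0.25}/2 \approx 0.3894$, which is where the numerical constants in the statement of the bound come from.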
Remark 1: The mutual coherence $\mu$ plays an important role in this probability bound. The mutual
coherence of various unit norm tight frames could be different, and its distribution is unknown. However,
the mutual coherence is fixed for some specific unit norm tight frames. For example, the Fourier-Dirac
tight frame has mutual coherence $\mu = \frac{1}{\sqrt{m}}$ [51], and the equiangular tight frame has mutual coherence

$$\mu = \sqrt{\frac{\tilde{n} - m}{m(\tilde{n} - 1)}}. \tag{48}$$
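The mutual coherence of a given frame can be computed directly, which makes it straightforward to check values such as the one for the Fourier-Dirac frame. The sketch below builds that frame as the concatenation of the identity and the unitary DFT matrix (a complex-valued construction, used here as an assumption for illustration); the function name `mutual_coherence` is hypothetical.

```python
import numpy as np

def mutual_coherence(A):
    """Largest absolute inner product between distinct unit-normalized columns."""
    A = A / np.linalg.norm(A, axis=0, keepdims=True)
    G = np.abs(A.conj().T @ A)
    np.fill_diagonal(G, 0.0)
    return float(G.max())

m = 16
F = np.fft.fft(np.eye(m)) / np.sqrt(m)        # unitary DFT matrix
fourier_dirac = np.hstack([np.eye(m), F])     # m x 2m unit norm tight frame
mu = mutual_coherence(fourier_dirac)
print(mu, 1.0 / np.sqrt(m))
```

The resulting value can be passed to a numerical evaluation of the probability bound to compare different unit norm tight frames.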
Remark 2: In [22], the authors conclude that the RIP is satisfied with a probability exceeding
$1 - O\left(\exp\left(-\frac{m^\beta}{s}\right)\right)$ ($0 < \beta < 1$) for a large class of matrices, including discrete chirp sensing matrices,
Bose, Chaudhuri, and Hocquenghem (BCH) sensing matrices, Kerdock, Delsarte-Goethals and second
order Reed Muller sensing matrices. According to Proposition 4, we can conclude that the RIP holds with
a probability that exceeds $1 - O\left(\exp\left(-\frac{m}{s}\right)\right)$ for Fourier-Dirac tight frames and the equiangular tight
frame, so that these tight frames exhibit better quality in terms of the StRIP in relation to the sensing
matrices in [22].

Remark 3: Proposition 4 requires unit norm tight frames with frame bound $\frac{\tilde{n}}{m}$. It turns out that one
can scale a unit norm tight frame via $\tilde{A} = \sqrt{\frac{m}{\tilde{n}}}\, A$, which leads to a Parseval tight frame with an equal
column norm, in order to achieve the frame bound equal to 1 used in the paper. In fact, scaling the unit
norm tight frames does not change the matrix structure, only the sensing energy.